# Lecture 11: CPU – OoO Execution

Hunjun Lee (hunjunlee@hanyang.ac.kr)

# Problem of the existing pipeline ...

- No identical operations, e.g., R-type, I-type & J-type
  - ⇒ Unify instruction types
    - Combine instruction types to flow through a "multi-function" pipe
- No uniform sub-operations, e.g., RF read vs. memory write
  - ⇒ Balance pipeline stages
    - Stage-latency calculation to make balanced stages
- No independent operations, e.g., waiting for data to be produced
  - ⇒ Remove dependencies and/or busy resources
    - Inter-instruction dependency detection and resolution



# Multipath Execution

- There are multiple execution stages in reality (separate execution paths, and memory unit)
- Some instructions take longer to finish than others



# Exceptions in multi-cycle execution: Option #1

- With multi-path execution, instructions may finish out of order!



We should not write the result before the prior instruction has completed!

# Exceptions in multi-cycle execution: Option #2

- A single slow-running instruction may delay the younger instructions



# Out-of-order execution: What we want!

- Younger instructions are executed early, but their WB is deferred

OoO Execution → younger instructions finish EX stage earlier



WB in-order

# Problem of the OoO Execution

- Younger instructions are executed early, but their WB is deferred



We need **temporary storage** to hold the computed results and forward them to dependent instructions

# Reorder buffer concept

- Instructions complete out of order, but they are reordered before they change the architectural state
  - Reserve an entry for each decoded instruction in program order
  - Write the result to the ROB upon completion
  - If the oldest entry has completed, write its result to the register file
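
The three steps above can be sketched in a few lines of Python. This is only an illustrative model, assuming the ROB is kept as a circular queue; the names `RobEntry`, `ReorderBuffer`, `allocate`, `complete`, and `commit` are hypothetical and not taken from the lecture.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RobEntry:
    valid: bool = False
    dst: Optional[str] = None
    value: Optional[int] = None
    ready: bool = False


class ReorderBuffer:
    """Illustrative circular-queue ROB (assumed layout, not the lecture's RTL)."""

    def __init__(self, size: int = 8):
        self.entries = [RobEntry() for _ in range(size)]
        self.head = 0   # oldest in-flight instruction
        self.tail = 0   # next free slot
        self.count = 0  # number of valid entries

    def allocate(self, dst: str) -> int:
        """Reserve an entry in program order; the index serves as a tag."""
        if self.count == len(self.entries):
            raise RuntimeError("ROB full: the front end must stall")
        idx = self.tail
        self.entries[idx] = RobEntry(valid=True, dst=dst)
        self.tail = (self.tail + 1) % len(self.entries)
        self.count += 1
        return idx

    def complete(self, idx: int, value: int) -> None:
        """Record a result; this may happen in any order."""
        self.entries[idx].value = value
        self.entries[idx].ready = True

    def commit(self, regfile: dict) -> Optional[str]:
        """Retire the oldest entry, but only if its result is ready."""
        entry = self.entries[self.head]
        if self.count == 0 or not entry.ready:
            return None
        regfile[entry.dst] = entry.value
        self.entries[self.head] = RobEntry()
        self.head = (self.head + 1) % len(self.entries)
        self.count -= 1
        return entry.dst
```

In this sketch a younger instruction can call `complete` first, yet `commit` returns `None` until the entry at `head` is ready, so the register file is only ever updated in program order.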



# The effects of ROB

- Instructions commit in order, even though their execution finishes out of order
- But sometimes a later instruction needs data that is still in the ROB!



# Reorder buffer implementation

ROB: Circular Queue

- It is essentially a circular queue that stores temporary values and controls precise writebacks



div r1, r4, r2 **IF** 

sub r4, r2, r4

sub r6, r3, r2

mul r5, r2, r5

add r7, r4, r6

## Register File (RF)

| Valid | Name | Value |
|-------|------|-------|
| 1     | r0   | 100   |
| 1     | r1   | 200   |
| 1     | r2   | 300   |
| 1     | r3   | 400   |
| 1     | r4   | 500   |
| 1     | r5   | 600   |
| 1     | r6   | 700   |
| 1     | r7   | 800   |

## Reorder Buffer (ROB)

|           | Valid | Dst Name | Value | Ready |
|-----------|-------|----------|-------|-------|
| HEAD/TAIL | 0     | -        | -     | 0     |
|           | 0     | -        | -     | 0     |
|           | 0     | -        | -     | 0     |
|           | 0     | -        | -     | 0     |
|           | 0     | -        | -     | 0     |
|           | 0     | -        | -     | 0     |
|           | 0     | -        | -     | 0     |
|           | 0     | -        | -     | 0     |

## Register File (RF)

| Valid | Name | Value |
|-------|------|-------|
| 1     | r0   | 100   |
| 0     | r1   | 200   |
| 1     | r2   | 300   |
| 1     | r3   | 400   |
| 1     | r4   | 500   |
| 1     | r5   | 600   |
| 1     | r6   | 700   |
| 1     | r7   | 800   |

## Reorder Buffer (ROB)

|      | Valid | Dst Name | Value | Ready |
|------|-------|----------|-------|-------|
| HEAD | 1     | r1       | -     | 0     |
| TAIL | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |

## Register File (RF)

| Valid | Name | Value |
|-------|------|-------|
| 1     | r0   | 100   |
| 0     | r1   | 200   |
| 1     | r2   | 300   |
| 1     | r3   | 400   |
| 0     | r4   | 500   |
| 1     | r5   | 600   |
| 1     | r6   | 700   |
| 1     | r7   | 800   |

|      | Valid | Dst Name | Value | Ready |
|------|-------|----------|-------|-------|
| HEAD | 1     | r1       | -     | 0     |
|      | 1     | r4       | -     | 0     |
| TAIL | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |

## Register File (RF)

| Valid | Name | Value |
|-------|------|-------|
| 1     | r0   | 100   |
| 0     | r1   | 200   |
| 1     | r2   | 300   |
| 1     | r3   | 400   |
| 0     | r4   | 500   |
| 1     | r5   | 600   |
| 0     | r6   | 700   |
| 1     | r7   | 800   |

|      | Valid | Dst Name | Value | Ready |
|------|-------|----------|-------|-------|
| HEAD | 1     | r1       | -     | 0     |
|      | 1     | r4       | -     | 0     |
|      | 1     | r6       | -     | 0     |
| TAIL | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |



## Register File (RF)

| Valid | Name | Value |
|-------|------|-------|
| 1     | r0   | 100   |
| 0     | r1   | 200   |
| 1     | r2   | 300   |
| 1     | r3   | 400   |
| 0     | r4   | 500   |
| 0     | r5   | 600   |
| 0     | r6   | 700   |
| 1     | r7   | 800   |

|      | Valid | Dst Name | Value | Ready |
|------|-------|----------|-------|-------|
| HEAD | 1     | r1       | -     | 0     |
|      | 1     | r4       | -200  | 1     |
|      | 1     | r6       | -     | 0     |
|      | 1     | r5       | -     | 0     |
| TAIL | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |



## Register File (RF)

| Valid | Name | Value |
|-------|------|-------|
| 1     | r0   | 100   |
| 0     | r1   | 200   |
| 1     | r2   | 300   |
| 1     | r3   | 400   |
| 0     | r4   | 500   |
| 0     | r5   | 600   |
| 0     | r6   | 700   |
| 0     | r7   | 800   |

|      | Valid | Dst Name | Value | Ready |
|------|-------|----------|-------|-------|
| HEAD | 1     | r1       | -     | 0     |
|      | 1     | r4       | -200  | 1     |
|      | 1     | r6       | 100   | 1     |
|      | 1     | r5       | -     | 0     |
|      | 1     | r7       | -     | 0     |
| TAIL | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |

## Register File (RF)

| Valid | Name | Value |
|-------|------|-------|
| 1     | r0   | 100   |
| 0     | r1   | 200   |
| 1     | r2   | 300   |
| 1     | r3   | 400   |
| 0     | r4   | 500   |
| 0     | r5   | 600   |
| 0     | r6   | 700   |
| 0     | r7   | 800   |

|      | Valid | Dst Name | Value | Ready |
|------|-------|----------|-------|-------|
| HEAD | 1     | r1       | -     | 0     |
|      | 1     | r4       | -200  | 1     |
|      | 1     | r6       | 100   | 1     |
|      | 1     | r5       | -     | 0     |
|      | 1     | r7       | -100  | 0     |
| TAIL | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |

## Register File (RF)

| Valid | Name | Value |
|-------|------|-------|
| 1     | r0   | 100   |
| 0     | r1   | 200   |
| 1     | r2   | 300   |
| 1     | r3   | 400   |
| 0     | r4   | 500   |
| 0     | r5   | 600   |
| 0     | r6   | 700   |
| 0     | r7   | 800   |

|      | Valid | Dst Name | Value | Ready |
|------|-------|----------|-------|-------|
| HEAD | 1     | r1       | 2     | 1     |
|      | 1     | r4       | -200  | 1     |
|      | 1     | r6       | 100   | 1     |
|      | 1     | r5       | -     | 0     |
|      | 1     | r7       | -100  | 0     |
| TAIL | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |

div r1, r4, r2

sub r4, r2, r4

sub r6, r3, r2

mul r5, r2, r5

add r7, r4, r6

### Register File (RF)

| Valid | Name | Value |
|-------|------|-------|
| 1     | r0   | 100   |
| 1     | r1   | 2     |
| 1     | r2   | 300   |
| 1     | r3   | 400   |
| 0     | r4   | 500   |
| 0     | r5   | 600   |
| 0     | r6   | 700   |
| 0     | r7   | 800   |

|      | Valid | Dst Name | Value | Ready |
|------|-------|----------|-------|-------|
|      | 0     | -        | -     | 0     |
| HEAD | 1     | r4       | -200  | 1     |
|      | 1     | r6       | 100   | 1     |
|      | 1     | r5       | -     | 0     |
|      | 1     | r7       | -100  | 0     |
| TAIL | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |

### Register File (RF)

| Valid | Name | Value |
|-------|------|-------|
| 1     | r0   | 100   |
| 1     | r1   | 2     |
| 1     | r2   | 300   |
| 1     | r3   | 400   |
| 1     | r4   | -200  |
| 0     | r5   | 600   |
| 0     | r6   | 700   |
| 0     | r7   | 800   |

|      | Valid | Dst Name | Value | Ready |
|------|-------|----------|-------|-------|
|      | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |
| HEAD | 1     | r6       | 100   | 1     |
|      | 1     | r5       | -     | 0     |
|      | 1     | r7       | -100  | 0     |
| TAIL | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |
|      | 0     | -        | -     | 0     |

## Register File (RF)

| Valid | Name | Value |
|-------|------|-------|
| 1     | r0   | 100   |
| 1     | r1   | 2     |
| 1     | r2   | 300   |
| 1     | r3   | 400   |
| 1     | r4   | -200  |
| 0     | r5   | 600   |
| 1     | r6   | 100   |
| 0     | r7   | 800   |

|      | Valid | Dst Name | Value  | Ready |
|------|-------|----------|--------|-------|
|      | 0     | -        | -      | 0     |
|      | 0     | -        | -      | 0     |
|      | 0     | -        | -      | 0     |
| HEAD | 1     | r5       | 180000 | 1     |
|      | 1     | r7       | -100   | 0     |
| TAIL | 0     | -        | -      | 0     |
|      | 0     | -        | -      | 0     |
|      | 0     | -        | -      | 0     |

# How to implement this? Option #1: Using CAM

- The ROB uses a content-addressable memory (CAM) to search for the entry whose destination register matches the requested source register
  - You do not need to iterate over the buffer to check whether a ROB entry holds the operand!



How would you handle multiple matches?
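
One plausible answer to the question above: when several ROB entries write the same register, a consumer needs the youngest one, because that entry holds the most recent value in program order. The sketch below models that selection in software. A real CAM compares all entries in parallel and a priority encoder picks the match; the sequential loop, the dictionary layout, and the name `find_latest_producer` are assumptions made for illustration.

```python
def find_latest_producer(rob, head, count, reg):
    """Return the index of the youngest valid ROB entry that writes `reg`.

    rob   : list of dicts with keys "valid" and "dst" (assumed layout)
    head  : index of the oldest in-flight entry
    count : number of in-flight entries
    Returns None when no in-flight instruction writes `reg`,
    in which case the operand is read from the register file.
    """
    size = len(rob)
    # Walk from the youngest entry back towards the head.
    for age in range(count - 1, -1, -1):
        idx = (head + age) % size
        entry = rob[idx]
        if entry["valid"] and entry["dst"] == reg:
            return idx
    return None
```

For a ROB holding writers of r1, r4, r6, r5, r6 (oldest first), a lookup for r6 selects the last entry rather than the third.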

# How to implement this? Option #2: Using Indirection

You can write down the entry address in the register file!

### Architectural Register File (RF)

| Valid | Name | Value | TAG  |
|-------|------|-------|------|
| 1     | r0   | 100   | -    |
| 1     | r1   | 2     | -    |
| 1     | r2   | 300   | -    |
| 1     | r3   | 400   | -    |
| 1     | r4   | -200  | -    |
| 0     | r5   | 600   | ROB3 |
| 1     | r6   | 100   | -    |
| 0     | r7   | 800   | ROB4 |

### Reorder Buffer (ROB)

|      | Entry | Valid | Dst Name | Value  | Ready |
|------|-------|-------|----------|--------|-------|
|      | ROB0  | 0     | -        | -      | 0     |
|      | ROB1  | 0     | -        | -      | 0     |
|      | ROB2  | 0     | -        | -      | 0     |
| HEAD | ROB3  | 1     | r5       | 180000 | 1     |
|      | ROB4  | 1     | r7       | -100   | 0     |
| TAIL | ROB5  | 0     | -        | -      | 0     |
|      | ROB6  | 0     | -        | -      | 0     |
|      | ROB7  | 0     | -        | -      | 0     |

You can think of this as renaming:

the source operand can either be in (1) RF or (2) ROB
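
A minimal sketch of the operand lookup with indirection, assuming the register file stores a ROB index as its tag. The function name `read_operand` and the dictionary layout are hypothetical; the values mirror the walkthrough state.

```python
def read_operand(reg, regfile, rob):
    """Locate a source operand through the tag stored in the register file.

    Returns ("value", v) if the operand is available, either in the RF or
    in a ready ROB entry, and ("wait", tag) if the producer has not finished.
    """
    entry = regfile[reg]
    if entry["valid"]:
        # (1) The architectural register file holds the latest value.
        return ("value", entry["value"])
    tag = entry["tag"]
    if rob[tag]["ready"]:
        # (2) The value is computed but not yet committed: read it from the ROB.
        return ("value", rob[tag]["value"])
    # The producer is still executing: remember which ROB entry to wait for.
    return ("wait", tag)


# State assumed from the walkthrough: r5 -> ROB3 (ready), r7 -> ROB4 (not ready).
regfile = {
    "r4": {"valid": True, "value": -200, "tag": None},
    "r5": {"valid": False, "value": 600, "tag": 3},
    "r7": {"valid": False, "value": 800, "tag": 4},
}
rob = {
    3: {"dst": "r5", "value": 180000, "ready": True},
    4: {"dst": "r7", "value": None, "ready": False},
}
```

No search is needed here: the tag leads directly to the one ROB entry that will produce the register's next value.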

# Renaming Result!



We do not read from the register file, but rather from the renamed entries @ ROB

# Hardware register renaming



- Renaming maintains bindings from architectural (arch) register names to microarchitectural (uarch) register names
  - Compiler does not know about uarch rename registers.
- When issuing an instruction that updates 'architecture' register 'rd':
  - Allocate an unused rename 'physical' register 'px'
  - Record binding from 'rd' to 'px'
- When to remove a binding? When to de-allocate a rename register?
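
The allocation and binding steps can be sketched with a rename table and a free list. Everything here is an illustrative assumption: the class name `Renamer`, the `pN` naming, and the release policy. One common policy, used as the assumption in this sketch, is to free the previous physical register of `rd` once the instruction that overwrote the binding has committed.

```python
class Renamer:
    """Illustrative rename table: architectural name -> physical name."""

    def __init__(self, arch_regs, num_phys):
        if num_phys < len(arch_regs):
            raise ValueError("need at least one physical register per architectural register")
        # Initially architectural register i is bound to physical register p<i>.
        self.table = {r: f"p{i}" for i, r in enumerate(arch_regs)}
        self.free = [f"p{i}" for i in range(len(arch_regs), num_phys)]

    def rename(self, dst, srcs):
        """Rename one instruction; returns (new_dst, renamed_srcs, old_dst)."""
        # Sources are looked up before the destination binding changes,
        # so an instruction such as r3 <- r3 + 1 reads the old r3.
        phys_srcs = [self.table[s] for s in srcs]
        if not self.free:
            raise RuntimeError("no free physical register: stall")
        new_phys = self.free.pop(0)
        old_phys = self.table[dst]
        self.table[dst] = new_phys
        return new_phys, phys_srcs, old_phys

    def release(self, phys):
        """Return a physical register to the free list (assumed: at commit of the overwriting instruction)."""
        self.free.append(phys)
```

The returned `old_phys` is what a commit stage would pass to `release` under the assumed policy.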

# Out-of-order execution (runtime scheduling!)

- After renaming, the WAW and WAR dependencies are almost eliminated
- If there is a true dependency (RAW) for a decoded instruction, the instruction cannot be executed (stalled)
- While a previous instruction is stalled, the later instructions can be dispatched to the EX/MEM units (unless there is a true dependency)

### Program order

$$\begin{array}{rcl}
 r1 & \Leftarrow & r2 * 3 \\
 r3 & \Leftarrow & r1 / 17 \\
 r4 & \Leftarrow & r0 - r3 \\
 \hline
 r3 & \Leftarrow & r12 + 1 \\
 r12 & \Leftarrow & r3 / 17 \\
 r4 & \Leftarrow & r0 - r20
 \end{array}$$

### Renaming

$$\begin{array}{rcl}
 p1 & \Leftarrow & p2 * 3 \\
 p3 & \Leftarrow & p1 / 17 \\
 p4 & \Leftarrow & p0 - p3 \\
 \hline
 p5 & \Leftarrow & p12 + 1 \\
 p13 & \Leftarrow & p5 / 17 \\
 p14 & \Leftarrow & p0 - p20
 \end{array}$$

### OoO execution

```
p1 ← p2 * 3
p5 ← p12 + 1
p3 ← p1 / 17
p13 ← p5 / 17
p4 ← p0 - p3
p14 ← p0 - p20
```

- If the dependency distance is larger than the issue distance, even RAW dependencies no longer cause stalls
- With OoO, superscalar gets much better!
- But OoO is a microarchitectural feature (not exposed to the programmer or the compiler)
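
The execution order of the renamed example can be reproduced with a tiny dataflow scheduler. This is a sketch under stated assumptions: every operation takes one cycle, two instructions issue per cycle, and the oldest ready instructions are selected first. The function name `schedule` and the tuple encoding are hypothetical.

```python
def schedule(instrs, width=2):
    """Issue renamed instructions as soon as their RAW dependencies are met.

    instrs : list of (dst, [src registers]) in program order, already renamed
             so that every dst is written exactly once.
    Returns a list of cycles, each a list of the dst registers issued.
    Assumes unit latency and oldest-first selection.
    """
    produced_here = {dst for dst, _ in instrs}
    done = set()
    pending = list(instrs)
    cycles = []
    while pending:
        # Ready: every source comes from outside the window or is already computed.
        ready = [ins for ins in pending
                 if all(s not in produced_here or s in done for s in ins[1])]
        issued = ready[:width]
        cycles.append([dst for dst, _ in issued])
        for ins in issued:
            pending.remove(ins)
        # Results become visible to consumers in the next cycle.
        done.update(dst for dst, _ in issued)
    return cycles


renamed = [
    ("p1", ["p2"]),          # p1  <- p2 * 3
    ("p3", ["p1"]),          # p3  <- p1 / 17
    ("p4", ["p0", "p3"]),    # p4  <- p0 - p3
    ("p5", ["p12"]),         # p5  <- p12 + 1
    ("p13", ["p5"]),         # p13 <- p5 / 17
    ("p14", ["p0", "p20"]),  # p14 <- p0 - p20
]
print(schedule(renamed))  # [['p1', 'p5'], ['p3', 'p13'], ['p4', 'p14']]
```

Under these assumptions the six instructions finish in three cycles, because the two renamed dependency chains no longer interfere with each other.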

# Reorder buffer concept

- The instructions are
  - (1) fetched / decoded + dispatched to the execution units in order
  - (2) executed **out-of-order** (a newer instruction may complete execution earlier)
  - (3) written back to the register file **in-order**



# Potential improvement of reordering

- add/sub utilizes the same unit and takes three cycles
- mul utilizes a different unit and takes eight cycles



It seems that mul r6, r2, r7 becomes the bottleneck... (Can we do better?)

# Out-of-order dispatch

- We can dispatch mul r6, r2, r7 before dispatching add r5, r4, r6
  - Can hide the execution latency!



However, the r6 operand read by add r5, r4, r6 would be overwritten by mul r6, r2, r7 (a WAR hazard)

# OoO CPU design

- We need to extend the existing architecture to support runtime instruction scheduling
  - 1) Rename registers at runtime to remove false dependencies
  - 2) Decouple fetch and execution using an issue buffer



# Register renaming implementation: Tomasulo's algorithm

- A physical register is a combination of architectural register + ROB entries
  - The value can either be in the ROB or architectural reg file
  - There is a table to indicate where the data resides
- Let's start with a reservation station + register file without a ROB (we'll get to the ROB later on ...)
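
Before stepping through the frames, here is a minimal sketch of the two operations the walkthrough relies on: placing an instruction into a reservation station and broadcasting a finished result by tag. The function names `issue` and `broadcast` and the dictionary layout are assumptions for illustration, and the model omits timing and functional-unit arbitration.

```python
def issue(rs_name, dst, srcs, regfile, stations):
    """Place an instruction into reservation station `rs_name`."""
    operands = []
    for s in srcs:
        reg = regfile[s]
        if reg["valid"]:
            # Operand is available: copy the value into the station.
            operands.append({"valid": True, "tag": None, "val": reg["val"]})
        else:
            # Operand is still being produced: remember the producer's tag.
            operands.append({"valid": False, "tag": reg["tag"], "val": None})
    stations[rs_name] = operands
    # The destination now waits for this station (sources were read first).
    regfile[dst]["valid"] = False
    regfile[dst]["tag"] = rs_name


def broadcast(rs_name, result, regfile, stations):
    """A unit finished: free the station and wake up everything waiting on its tag."""
    del stations[rs_name]
    for operands in stations.values():
        for op in operands:
            if not op["valid"] and op["tag"] == rs_name:
                op["valid"], op["val"] = True, result
    for reg in regfile.values():
        # Only update the RF if no younger instruction has re-tagged the register.
        if not reg["valid"] and reg["tag"] == rs_name:
            reg["valid"], reg["val"], reg["tag"] = True, result, None
```

Note that `broadcast` leaves a register untouched when its tag names a different station, which is how a later write to the same register takes precedence over an older one.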

#### Register File (RF)

| Valid | Name | Value | TAG |
|-------|------|-------|-----|
| 1     | r0   | 0     | -   |
| 1     | r1   | 1     | -   |
| 1     | r2   | 2     | -   |
| 1     | r3   | 3     | -   |
| 1     | r4   | 4     | -   |
| 1     | r5   | 5     | -   |
| 1     | r6   | 6     | -   |
| 1     | r7   | 7     | -   |

#### Reservation Station (ALU)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| A0    |            |          |          |            |          |          |
| A1    |            |          |          |            |          |          |
| A2    |            |          |          |            |          |          |

#### Reservation Station (MUL)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| M0    |            |          |          |            |          |          |
| M1    |            |          |          |            |          |          |
| M2    |            |          |          |            |          |          |











| Instruction    | Stages |
|----------------|--------|
| mul r1, r2, r3 | F D E  |
| add r4, r2, r4 | F D    |
| mul r6, r3, r2 | F      |
| add r5, r4, r6 |        |
| mul r6, r2, r7 |        |

### Register File (RF)

| Valid | Name | VAL | TAG |
|-------|------|-----|-----|
| 1     | r0   | 0   | -   |
| 0     | r1   | -   | M0  |
| 1     | r2   | 2   | -   |
| 1     | r3   | 3   | -   |
| 0     | r4   | 4   | A0  |
| 1     | r5   | 5   | -   |
| 1     | r6   | 6   | -   |
| 1     | r7   | 7   | -   |

#### Reservation Station (ALU)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| A0    | 1          | -        | 2        | 1          | -        | 4        |
| A1    |            |          |          |            |          |          |
| A2    |            |          |          |            |          |          |

#### Reservation Station (MUL)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| M0    | 1          | -        | 2        | 1          | -        | 3        |
| M1    |            |          |          |            |          |          |
| M2    |            |          |          |            |          |          |

Dispatch if possible!



### Register File (RF)

| Valid | Name | VAL | TAG |
|-------|------|-----|-----|
| 1     | r0   | 0   | -   |
| 0     | r1   | -   | M0  |
| 1     | r2   | 2   | -   |
| 1     | r3   | 3   | -   |
| 0     | r4   | 4   | A0  |
| 1     | r5   | 5   | -   |
| 0     | r6   | 6   | M1  |
| 1     | r7   | 7   | -   |

#### Reservation Station (ALU)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| A0    | 1          | -        | 2        | 1          | -        | 4        |
| A1    |            |          |          |            |          |          |
| A2    |            |          |          |            |          |          |

A0 → ALU

#### Reservation Station (MUL)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| M0    | 1          | -        | 2        | 1          | -        | 3        |
| M1    | 1          | -        | 3        | 1          | -        | 2        |
| M2    |            |          |          |            |          |          |







#### Register File (RF)

| Valid | Name | VAL | TAG |
|-------|------|-----|-----|
| 1     | r0   | 0   | -   |
| 0     | r1   | -   | M0  |
| 1     | r2   | 2   | -   |
| 1     | r3   | 3   | -   |
| 0     | r4   | 4   | A0  |
| 0     | r5   | 5   | A1  |
| 0     | r6   | 6   | M1  |
| 1     | r7   | 7   | -   |

#### Reservation Station (ALU)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| A0    | 1          | -        | 2        | 1          | -        | 4        |
| A1    | 0          | A0       | -        | 0          | M1       | -        |
| A2    |            |          |          |            |          |          |

A0 → ALU

#### Reservation Station (MUL)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| M0    | 1          | -        | 2        | 1          | -        | 3        |
| M1    | 1          | -        | 3        | 1          | -        | 2        |
| M2    |            |          |          |            |          |          |





### Register File (RF)

| Valid | Name | VAL | TAG |
|-------|------|-----|-----|
| 1     | r0   | 0   | -   |
| 0     | r1   | -   | M0  |
| 1     | r2   | 2   | -   |
| 1     | r3   | 3   | -   |
| 1     | r4   | 6   | -   |
| 0     | r5   | 5   | A1  |
| 0     | r6   | 6   | M1  |
| 1     | r7   | 7   | -   |

### Reservation Station (ALU)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| A0    | 1          | -        | 2        | 1          | -        | 4        |
| A1    | 1          | A0       | 6        | 0          | M1       | -        |
| A2    |            |          |          |            |          |          |

### Reservation Station (MUL)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| M0    | 1          | -        | 2        | 1          | -        | 3        |
| M1    | 1          | -        | 3        | 1          | -        | 2        |
| M2    |            |          |          |            |          |          |

A0 done



### Register File (RF)

| Valid | Name | VAL | TAG |
|-------|------|-----|-----|
| 1     | r0   | 0   | -   |
| 0     | r1   | 1   | M0  |
| 1     | r2   | 2   | -   |
| 1     | r3   | 3   | -   |
| 1     | r4   | 6   | -   |
| 0     | r5   | 5   | A1  |
| 0     | r6   | 6   | M2  |
| 1     | r7   | 7   | -   |

#### Reservation Station (ALU)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| A0    |            |          |          |            |          |          |
| A1    | 1          | A0       | 6        | 0          | M1       | -        |
| A2    |            |          |          |            |          |          |

#### Reservation Station (MUL)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| M0    | 1          | -        | 2        | 1          | -        | 3        |
| M1    | 1          | -        | 3        | 1          | -        | 2        |
| M2    | 1          | -        | 2        | 1          | -        | 7        |





| Instruction    | Stages    |
|----------------|-----------|
| mul r1, r2, r3 | FDEEEEEEE |
| add r4, r2, r4 | FDEEER    |
| mul r6, r3, r2 | FDEEEEE   |
| add r5, r4, r6 |           |
| mul r6, r2, r7 | FDEEE     |

#### Register File (RF)

| Valid | Name | VAL | TAG |
|-------|------|-----|-----|
| 1     | r0   | 0   | -   |
| 1     | r1   | 6   | -   |
| 1     | r2   | 2   | -   |
| 1     | r3   | 3   | -   |
| 1     | r4   | 6   | -   |
| 0     | r5   | 5   | A1  |
| 0     | r6   | 6   | M2  |
| 1     | r7   | 7   | -   |

### Reservation Station (ALU)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| A0    |            |          |          |            |          |          |
| A1    | 1          | A0       | 6        | 0          | M1       | -        |
| A2    |            |          |          |            |          |          |

### Reservation Station (MUL)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| M0    | 1          | -        | 2        | 1          | -        | 3        |
| M1    | 1          | -        | 3        | 1          | -        | 2        |
| M2    | 1          | -        | 2        | 1          | -        | 7        |





M0 done





#### Register File (RF)

| Valid | Name | VAL | TAG |
|-------|------|-----|-----|
| 1     | r0   | 0   | -   |
| 1     | r1   | 6   | -   |
| 1     | r2   | 2   | -   |
| 1     | r3   | 3   | -   |
| 1     | r4   | 6   | -   |
| 0     | r5   | 5   | A1  |
| 0     | r6   | 6   | M2  |
| 1     | r7   | 7   | -   |

#### Reservation Station (ALU)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| A0    |            |          |          |            |          |          |
| A1    | 1          | A0       | 6        | 1          | -        | 6        |
| A2    |            |          |          |            |          |          |

#### Reservation Station (MUL)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| M0    |            |          |          |            |          |          |
| M1    | 1          | -        | 3        | 1          | -        | 2        |
| M2    | 1          | -        | 2        | 1          | -        | 7        |





M1 done





#### Register File (RF)

| Valid | Name | VAL | TAG |
|-------|------|-----|-----|
| 1     | r0   | 0   | -   |
| 1     | r1   | 6   | -   |
| 1     | r2   | 2   | -   |
| 1     | r3   | 3   | -   |
| 1     | r4   | 6   | -   |
| 0     | r5   | 5   | A1  |
| 0     | r6   | 6   | M2  |
| 1     | r7   | 7   | -   |

#### Reservation Station (ALU)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| A0    |            |          |          |            |          |          |
| A1    | 1          | A0       | 6        | 1          | -        | 6        |
| A2    |            |          |          |            |          |          |

#### Reservation Station (MUL)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| M0    |            |          |          |            |          |          |
| M1    |            |          |          |            |          |          |
| M2    | 1          | -        | 2        | 1          | -        | 7        |





| Instruction    | Stages |
|----------------|--------|
| mul r1, r2, r3 | FDEEE  |
| add r4, r2, r4 | FDEE   |
| mul r6, r3, r2 | FDE    |
| add r5, r4, r6 | FD     |
| mul r6, r2, r7 | FD     |



### Register File (RF)

| Valid | Name | VAL | TAG |
|-------|------|-----|-----|
| 1     | r0   | 0   | -   |
| 1     | r1   | 6   | -   |
| 1     | r2   | 2   | -   |
| 1     | r3   | 3   | -   |
| 1     | r4   | 6   | -   |
| 0     | r5   | 5   | A1  |
| 1     | r6   | 14  | -   |
| 1     | r7   | 7   | -   |

### Reservation Station (ALU)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| A0    |            |          |          |            |          |          |
| A1    | 1          | A0       | 6        | 1          | -        | 6        |
| A2    |            |          |          |            |          |          |

### Reservation Station (MUL)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| M0    |            |          |          |            |          |          |
| M1    |            |          |          |            |          |          |
| M2    | 1          | -        | 2        | 1          | -        | 7        |





M2 done



mul r1, r2, r3
add r4, r2, r4
mul r6, r3, r2
add r5, r4, r6
mul r6, r2, r7



#### Register File (RF)

| Valid | Name | VAL | TAG |
|-------|------|-----|-----|
| 1     | r0   | 0   | -   |
| 1     | r1   | 6   | -   |
| 1     | r2   | 2   | -   |
| 1     | r3   | 3   | -   |
| 1     | r4   | 6   | -   |
| 1     | r5   | 12  | -   |
| 1     | r6   | 14  | -   |
| 1     | r7   | 7   | -   |

### Reservation Station (ALU)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| A0    |            |          |          |            |          |          |
| A1    | 1          | A0       | 6        | 1          | -        | 6        |
| A2    |            |          |          |            |          |          |

### Reservation Station (MUL)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| M0    |            |          |          |            |          |          |
| M1    |            |          |          |            |          |          |
| M2    |            |          |          |            |          |          |

A1 done



mul r1, r2, r3

FDEEEEEERW

add r4, r2, r4

mul r6, r3, r2

add r5, r4, r6

mul r6, r2, r7

FDEEEEEEERW

FDEEEEERW

FDEEEEEERW

FDEEEEEERW

FDEEEEEERW

#### Register File (RF)

| Valid | Name | VAL | TAG |
|-------|------|-----|-----|
| 1     | r0   | 0   | -   |
| 1     | r1   | 6   | -   |
| 1     | r2   | 2   | -   |
| 1     | r3   | 3   | -   |
| 1     | r4   | 6   | -   |
| 1     | r5   | 12  | -   |
| 1     | r6   | 14  | -   |
| 1     | r7   | 7   | -   |

#### Reservation Station (ALU)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| A0    |            |          |          |            |          |          |
| A1    |            |          |          |            |          |          |
| A2    |            |          |          |            |          |          |

#### Reservation Station (MUL)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| M0    |            |          |          |            |          |          |
| M1    |            |          |          |            |          |          |
| M2    |            |          |          |            |          |          |



## OoO + RS + ROB

#### Register File (RF)

| Valid | Name | VAL | TAG |
|-------|------|-----|-----|
| 1     | r0   | 0   | -   |
| 0     | r1   | -   | R0  |
| 1     | r2   | 2   | -   |
| 1     | r3   | 3   | -   |
| 0     | r4   | 6   | R1  |
| 0     | r5   | 5   | R3  |
| 0     | r6   | 6   | R4  |
| 1     | r7   | 7   | -   |

#### Reservation Station (ALU)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| A0    |            |          |          |            |          |          |
| A1    | 1          | R1       | 6        | 0          | R2       | -        |
| A2    |            |          |          |            |          |          |

#### Reservation Station (MUL)

| Entry | SRC1 Valid | SRC1 TAG | SRC1 VAL | SRC2 Valid | SRC2 TAG | SRC2 VAL |
|-------|------------|----------|----------|------------|----------|----------|
| M0    | 1          | -        | 2        | 1          | -        | 3        |
| M1    | 1          | -        | 3        | 1          | -        | 2        |
| M2    | 1          | -        | 2        | 1          | -        | 7        |

#### **Timeline**

| Instruction    | Stages  |
|----------------|---------|
| mul r1, r2, r3 | FDEEEEE |
| add r4, r2, r4 | FDEEER  |
| mul r6, r3, r2 | FDEEE   |
| add r5, r4, r6 | FD      |
| mul r6, r2, r7 | FDE     |

#### Reorder Buffer

| Name | Valid | Dst Name | Value | Ready |
|------|-------|-----------------|-------|-------|
| R0   | 1     | r1              | -     | 0     |
| R1   | 1     | r4              | 6     | 1     |
| R2   | 1     | r6              | -     | 0     |
| R3   | 1     | r5              | -     | 0     |
| R4   | 1     | r6              | -     | 0     |

# Control dependency can hurt ILP

- We also suffer from control dependencies (some fetched instructions in the RS may turn out to be invalid)
- Control instructions account for about 14% of all instructions on average
  - With 128 in-flight instructions → about 18 branches
  - With a 90%-accurate branch predictor → (0.9)^18 → only a 15% chance that all the branches are predicted correctly
    - Even with a 99%-accurate branch predictor, the chance is only 83%
  - We indeed need an extremely accurate branch predictor!!
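
The arithmetic behind these numbers can be checked directly. The sketch assumes that branch outcomes are predicted independently of each other, which is a simplification; the function name `all_correct_probability` is hypothetical.

```python
def all_correct_probability(in_flight, branch_fraction, accuracy):
    """Chance that every in-flight branch is predicted correctly.

    Assumes independent predictions, so the per-branch accuracy is
    simply raised to the number of in-flight branches.
    """
    branches = round(in_flight * branch_fraction)  # 128 * 0.14 = 17.92 -> 18
    return branches, accuracy ** branches


for acc in (0.90, 0.99):
    n, p = all_correct_probability(128, 0.14, acc)
    print(f"{n} branches, accuracy {acc:.2f}: {p:.1%} chance all are correct")
```

With 18 branches in flight this gives roughly 15% for a 90%-accurate predictor and roughly 83% for a 99%-accurate one, matching the figures above.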

## Question?

Announcements:

Reading: finish reading P&H Ch.4

Handouts: none